Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives
Over the past few years, adversarial training has become an extremely active
research topic and has been successfully applied to various Artificial
Intelligence (AI) domains. Because it is a potentially crucial technique for the
development of the next generation of emotional AI systems, we herein provide a
comprehensive overview of the application of adversarial training to affective
computing and sentiment analysis. Various representative adversarial training
algorithms are explained and discussed, each aimed at tackling the diverse
challenges associated with emotional AI systems. Further, we highlight a range
of potential future research directions. We expect that this overview will help
facilitate the development of adversarial training for affective computing and
sentiment analysis in both the academic and industrial communities.
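To make the kind of algorithm discussed above concrete, below is a minimal sketch of one representative family: training on gradient-based (FGSM-style) perturbations of input embeddings for a toy sentiment classifier. The model, data, and the epsilon value are illustrative assumptions, not the setup of any specific work covered in the overview.

```python
# Minimal sketch of FGSM-style adversarial training on input embeddings
# for a toy sentiment classifier. All sizes and the epsilon value are
# illustrative assumptions.
import torch
import torch.nn as nn

class SentimentClassifier(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hidden=128, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, classes)

    def forward(self, emb):                 # emb: (batch, time, emb_dim)
        _, h = self.rnn(emb)
        return self.out(h.squeeze(0))       # logits: (batch, classes)

model = SentimentClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
epsilon = 0.1                               # assumed perturbation magnitude

tokens = torch.randint(0, 1000, (8, 20))    # toy batch of token ids
labels = torch.randint(0, 2, (8,))          # toy sentiment labels

# 1) Clean pass; keep the gradient w.r.t. the embeddings.
emb = model.embed(tokens)
emb.retain_grad()
clean_loss = criterion(model(emb), labels)
clean_loss.backward(retain_graph=True)

# 2) Craft an FGSM perturbation in embedding space, detached from the graph.
delta = epsilon * emb.grad.detach().sign()
adv_loss = criterion(model(emb.detach() + delta), labels)

# 3) Update the model on the combined clean + adversarial objective.
optimizer.zero_grad()
(clean_loss + adv_loss).backward()
optimizer.step()
```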
Learning Audio Sequence Representations for Acoustic Event Classification
Acoustic Event Classification (AEC) has become a significant task for
machines to perceive the surrounding auditory scene. However, extracting
effective representations that capture the underlying characteristics of the
acoustic events is still challenging. Previous methods mainly focused on
designing the audio features in a 'hand-crafted' manner. Interestingly,
data-learnt features have recently been reported to show better performance. Up
to now, however, these have only been considered at the frame level. In this paper, we
propose an unsupervised learning framework to learn a vector representation of
an audio sequence for AEC. This framework consists of a Recurrent Neural
Network (RNN) encoder and an RNN decoder, which respectively transform the
variable-length audio sequence into a fixed-length vector and reconstruct the
input sequence from the generated vector. After training the encoder-decoder, we
feed the audio sequences to the encoder and take the learnt vectors as the
audio sequence representations. Compared with previous methods, the proposed
method can not only deal with arbitrary-length audio
streams, but also learn the salient information of the sequence. Extensive
evaluation on a large-scale acoustic event database shows that the learnt
audio sequence representations outperform other state-of-the-art hand-crafted
sequence features for AEC by a large margin.
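As a hedged illustration of the encoder-decoder idea described above (not the exact configuration used in the paper), the sketch below uses a GRU encoder whose final hidden state serves as the fixed-length sequence vector and a GRU decoder that reconstructs the frames from it. The feature dimensionality and training details are assumptions.

```python
# Sketch of a sequence autoencoder: a GRU encoder compresses a
# variable-length feature sequence into a fixed-length vector, and a GRU
# decoder reconstructs the sequence from that vector.
import torch
import torch.nn as nn

class SeqAutoencoder(nn.Module):
    def __init__(self, feat_dim=40, hidden=256):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.decoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.project = nn.Linear(hidden, feat_dim)

    def encode(self, x):
        # x: (batch, time, feat_dim); the final hidden state is the
        # fixed-length representation of the whole sequence.
        _, h = self.encoder(x)
        return h.squeeze(0)                      # (batch, hidden)

    def forward(self, x):
        code = self.encode(x)
        # Condition the decoder on the code and reconstruct the input
        # frame by frame (teacher forcing on the shifted input).
        dec_in = torch.cat([torch.zeros_like(x[:, :1]), x[:, :-1]], dim=1)
        out, _ = self.decoder(dec_in, code.unsqueeze(0).contiguous())
        return self.project(out)                 # (batch, time, feat_dim)

model = SeqAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(4, 120, 40)                      # toy batch of feature frames
loss = nn.functional.mse_loss(model(x), x)       # reconstruction objective
loss.backward()
opt.step()
# After training, model.encode(x) yields the fixed-length representation
# that can be fed to a downstream acoustic event classifier.
```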
Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments
Eliminating the negative effect of non-stationary environmental noise is a
long-standing research topic for automatic speech recognition that still
remains an important challenge. Data-driven supervised approaches, including
ones based on deep neural networks, have recently emerged as potential
alternatives to traditional unsupervised approaches and, with sufficient
training, can alleviate the shortcomings of the unsupervised methods in various
real-life acoustic environments. In this light, we review recently developed,
representative deep learning approaches for tackling non-stationary additive
and convolutional degradation of speech with the aim of providing guidelines
for those involved in the development of environmentally robust speech
recognition systems. We separately discuss single- and multi-channel techniques
developed for the front-end and back-end of speech recognition systems, as well
as joint front-end and back-end training frameworks.
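As one concrete example of the front-end techniques in this family (a common approach in the literature, not a method from any specific paper surveyed here), a small recurrent network can estimate a time-frequency mask that suppresses additive noise before recognition. The architecture and dimensions below are assumptions.

```python
# Sketch of a mask-based enhancement front-end: a recurrent network
# estimates a per-bin mask in [0, 1] that is applied to the noisy
# magnitude spectrogram before features are passed to the recognizer.
import torch
import torch.nn as nn

class MaskEstimator(nn.Module):
    def __init__(self, n_bins=257, hidden=256):
        super().__init__()
        self.rnn = nn.LSTM(n_bins, hidden, batch_first=True, bidirectional=True)
        self.mask = nn.Sequential(nn.Linear(2 * hidden, n_bins), nn.Sigmoid())

    def forward(self, noisy_mag):
        # noisy_mag: (batch, frames, n_bins) magnitude spectrogram
        h, _ = self.rnn(noisy_mag)
        return self.mask(h)                 # mask per time-frequency bin

model = MaskEstimator()
noisy = torch.rand(2, 100, 257)             # toy noisy magnitudes
clean = torch.rand(2, 100, 257)             # toy clean targets
enhanced = model(noisy) * noisy             # apply the estimated mask
loss = nn.functional.mse_loss(enhanced, clean)
loss.backward()
# In a joint training setup, the ASR loss would instead be back-propagated
# through both the recognizer and this front-end.
```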
Emergent Communication in Interactive Sketch Question Answering
Vision-based emergent communication (EC) aims to learn to communicate through
sketches and demystify the evolution of human communication. However,
previous works neglect multi-round interaction, which is indispensable in human
communication. To fill this gap, we first introduce a novel Interactive Sketch
Question Answering (ISQA) task, in which two collaborative players interact
through sketches to answer a question about an image in a multi-round manner.
To accomplish this task, we design a new and efficient interactive EC system,
which can achieve an effective balance among three evaluation factors:
question answering accuracy, drawing complexity, and human
interpretability. Our experimental results, including human evaluation,
demonstrate that the multi-round interactive mechanism facilitates targeted and
efficient communication between intelligent agents with decent human
interpretability. Comment: Accepted by NeurIPS 202
Implicit fusion by joint audiovisual training for emotion recognition in mono modality
A paper in ICASSP 201
Exploring perception uncertainty for emotion recognition in dyadic conversation and music listening
An article in Cognitive Computation
Latency-Aware Collaborative Perception
Collaborative perception has recently shown great potential to improve
perception capabilities over single-agent perception. Existing collaborative
perception methods usually consider an ideal communication environment.
However, in practice, the communication system inevitably suffers from latency
issues, causing potential performance degradation and high risks in
safety-critical applications, such as autonomous driving. To mitigate the
effect caused by the inevitable latency, from a machine learning perspective,
we present the first latency-aware collaborative perception system, which
actively adapts asynchronous perceptual features from multiple agents to the
same time stamp, promoting the robustness and effectiveness of collaboration.
To achieve such a feature-level synchronization, we propose a novel latency
compensation module, called SyncNet, which leverages feature-attention
symbiotic estimation and time modulation techniques. Experimental results show
that the proposed latency-aware collaborative perception system with SyncNet
outperforms the state-of-the-art collaborative perception method by 15.6%
in the communication latency scenario and keeps collaborative perception
superior to single-agent perception under severe latency. Comment: 14 pages, 11 figures, Accepted by European Conference on Computer
Vision, 202
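As a minimal sketch of the general idea of feature-level latency compensation (only an illustration of the concept, not the SyncNet module or its feature-attention and time-modulation design), a small network can predict a correction to the stale collaborator features conditioned on the measured delay, so that they better approximate the current time stamp.

```python
# Sketch of latency compensation at the feature level: predict a residual
# update for delayed features, conditioned on the delay itself.
import torch
import torch.nn as nn

class LatencyCompensator(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + 1, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, stale_feat, delay):
        # stale_feat: (batch, feat_dim) features received from another agent
        # delay:      (batch, 1) latency between capture and fusion
        residual = self.net(torch.cat([stale_feat, delay], dim=-1))
        return stale_feat + residual        # estimate of the features "now"

comp = LatencyCompensator()
stale = torch.randn(4, 128)
delay = torch.rand(4, 1)
synced = comp(stale, delay)                 # fuse `synced` with ego features
```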
Generating and protecting against adversarial attacks for deep speech-based emotion recognition models
A paper in ICASSP 202